Computer systems can be effectively divided into three parts: information entry,
information structuring, and information retrieval. In a document surrogate retrieval
system, each document requires most of the following during entry: acquisition;
cataloging; abstracting; indexing; generation of a machine-readable record; validation
and editing; and preparation of records for retrieval. Information structuring, if
present, supports all users by reducing the cost of searching with the use of the
following: data compression; reorganization of data for efficient access; and/or
maintenance of auxiliary files which assist retrieval and publications. For every
query, retrieval involves repeated steps which include: familiarization with the user's
needs; entry of the request to the computer; access of the data base; and display of
documents and/or surrogates to the user.


As noted, entry cost is incurred for every document and retrieval cost is incurred 
for each query. Costs are divided between human and machine with entry and retrieval 
being mainly human costs. Since computer costs are decreasing and personnel costs 
are increasing, the cost/effectiveness of work is shifting in favor of having the
machine accomplish more. Structuring costs are mainly computer costs with human costs 
limited to programming and vocabulary maintenance. Some structuring techniques have 
resulted in substantial decreases in entry and retrieval costs. 

Natural language techniques work quite satisfactorily for those systems which are 
just concerned with information retrieval. For those systems that additionally require
printed indexes of subject terms, or possibly thesaurus-controlled terms, natural
language indexing alone cannot accomplish the task. Satisfactory print terms of the
appropriate caliber and number can be generated in a primarily automatic manner. 

The NASA Scientific and Technical Information Facility is actively investigating methods
of selecting multi-word index terms (phrases) for widely distributed abstract journals
which will require minimal human effort to index each abstract while taking maximum
advantage of computer index selections. Several experiments are planned to investigate
the adequacy of the algorithm defined below relative to the present method, which is
based on professional indexers. The present computer algorithm generates index phrases
from each abstract by: deletion of mundane words; look-up of remaining word strings in
the (controlled) NASA Thesaurus to determine the closest entry phrase; ranking the
corresponding "use" phrase by the stored human assessment of the phrase's usefulness;
"boosting" that assessment by the assessment of all lower ranked phrases with word stems
in common with a given phrase; reranking; and, finally, selection of the appropriate
number of important phrases. Should there be too few adequately important terms, or too
many of similar importance, the surrogate can be marked for manual consideration,
resulting in modification of the thesaurus or of the values which govern the print term
selection. Lexicographic control of the thesaurus will maintain the system with
respect to new terminology as well as typographic anomalies.

The NASA Automatic Subject Analysis Technique for Extracting Retrievable Multi-terms
(NASA TERM) System is a computer coordinated indexing approach for publications and
document surrogate retrieval systems from natural language abstracts. The NASA TERM
System is in its embryonic state of development, having recently completed its
feasibility study phase. This paper will present some background information, the 
description of the planned NASA TERM System, and the itemization of research and 
development experiments. 

2. Background


2.1 Parts of a Typical Document Retrieval System

There are two main types of information retrieval systems: data retrieval
systems and document retrieval systems. The response of a data retrieval
system has the pattern "X is Y"; the response of a document retrieval system
has the pattern "the item Z discusses topic X". For example, a data retrieval
system is built to answer questions like: "What is the melting point of
titanium steel?"; whereas a document retrieval system is built to answer 
questions like: "Which documents discuss the melting points of titanium 
steels?" This difference is quite significant in that a data retrieval 
system must understand a user's requirement at a much deeper level and respond 
with facts; a document retrieval system need only supply the bibliographic 
reference (accession number) of pertinent documents. 

Most data retrieval systems are based on sophisticated data bases loaded 
and searched by experts using highly structured queries. Document retrieval 
systems need not be so sophisticated; in fact, they may perform better if 
not overly complex. One pattern detected throughout an analysis of ongoing 
document retrieval systems is that sophisticated techniques help some users 
and hurt others; the problem is knowing which users will be helped and which 
hurt. 

Most computer systems, including document retrieval systems, can be effectively
divided into three parts, as shown in Table 1-1: information entry, information
structuring, and information retrieval.

2.1.1 Information Entry 


In a document retrieval system, each new document requires the performance
of most of the following component steps during data entry: acquisition;
cataloging; abstracting; indexing; entry of cataloging, abstracting and
indexing information to the computer; computer verification of the adequacy
of data about the document; conversion of entered data from entry standard
"codes", etc. to retrieval or display standards; and, finally, the preparation
of the document record for retrieval.

Table 1-1. Phases Found in Many Document Retrieval Systems

ENTRY                          STRUCTURING                    RETRIEVAL
For each document              For documents or queries       For each query
individually                   in general                     individually

Acquisition                    Analysis of new terms and      Conversion of entry text to
Cataloging                       phrases                        internal form
Abstract review/preparation    Creation/extension of          Display of terms related to
Selection of Print Terms         inverted file                  terms in original query
Selection of Online Terms      Creation/extension of          Display of search statistics,
Entry of information to          cluster centroids              e.g., number of documents
  computer                                                      isolated
Verification of entered data                                  Display of citation and
Conversion to storage format                                    abstracts
                                                              Comparison of queries with
                                                                centroids or documents
                                                              Concatenation and intersection
                                                                of inversion lists (one
                                                                operation per significant
                                                                term in expanded query)

2.1.2 Information Structuring

Information structuring includes those steps applied to surrogates collectively.
In some systems, there is no structuring at all; the data records
produced by the information entry steps are read one after the other and
"compared" to a user's request. Such a technique is exorbitantly expensive
if there are very many documents or queries. Information structuring is a
"front money" cost that increases the base cost of a system. This investment
is made in order to substantially decrease the cost of data searching and/or
information entry; i.e., to decrease costs associated with every query and/or
every document.

Typical steps in information structuring may include: (1) data compression,
such as the conversion of character strings to numbers to decrease the
amount of storage required; (2) reorganization of data from entry order to
an order supportive of faster or less expensive searching; (3) generation
and maintenance of auxiliary files which assist retrieval; and (4) publication
of tools to be used for many searches.
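
Steps (1) and (3) can be illustrated with a minimal Python sketch. It is not taken from the Facility's system: the function names and the sample accession numbers are invented for illustration. Each distinct word is replaced by a small integer (a simple form of data compression), and an inverted file records which documents contain each word.

```python
from collections import defaultdict

def build_structures(documents):
    """Assign each distinct word a number and build an inverted file.

    `documents` maps an accession number to its surrogate text.
    """
    term_ids = {}                  # word -> small integer (data compression)
    inverted = defaultdict(set)    # term id -> accession numbers (auxiliary file)
    for accession, text in documents.items():
        for word in text.lower().split():
            term_id = term_ids.setdefault(word, len(term_ids))
            inverted[term_id].add(accession)
    return term_ids, inverted

def search(term_ids, inverted, *words):
    """Return accession numbers of documents containing every query word."""
    postings = [inverted.get(term_ids.get(w.lower(), -1), set()) for w in words]
    return set.intersection(*postings) if postings else set()

# Made-up sample surrogates, for illustration only.
docs = {
    "N76-10001": "melting points of titanium steels",
    "N76-10002": "titanium wing structures",
}
term_ids, inverted = build_structures(docs)
print(sorted(search(term_ids, inverted, "titanium")))
print(sorted(search(term_ids, inverted, "titanium", "steels")))
```

The inverted file is built once, at structuring time, so each query costs only a few set intersections rather than a pass over every record.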

2.1.3 Information Retrieval 


Retrieval involves all steps accomplished for each individual query. These
steps include expert familiarization with user's needs, entry of a request 
to the computer, access of the data base to locate names of documents which 
will, hopefully, be pertinent, and display of document names (and often 
surrogates) to the document retrieval specialist or to the user. The 
process is repeated for each query. 

2.1.4 Economic Tradeoffs 


As noted above, the cost of data entry is incurred for every new document;
the cost of retrieval is incurred for every query. No structuring is
required to support document retrieval; however, some structuring techniques
have resulted in very substantial decreases in entry and/or retrieval costs.
The primary management problem is to decide whether an investment in structuring
is justified by the expected payoff in reduced entry and retrieval costs.

Costs can be divided into two types: machine and human. Structuring costs
are mainly computer costs, with human costs limited to such items as programming
and the addition of new terms and phrases to a thesaurus. Entry
and retrieval costs are mainly human costs (with some machine costs in support).
These costs are incurred for every document or individual query. The fact
that computer costs are decreasing about 20% per year and personnel costs
are rising about 20% per year means that the cost/effectiveness of work
performed by the computer relative to work performed by the person is constantly
shifting in favor of having the computer do more and more.
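
The size of this shift can be worked out with a short Python sketch. It assumes, purely for illustration, that both 20% rates hold steadily and compound annually; the function name is hypothetical.

```python
def relative_machine_cost(years, machine_change=-0.20, human_change=0.20):
    """Machine cost divided by human cost after `years`, relative to 1.0 today.

    Assumes both annual rates stay constant and compound each year.
    """
    return ((1 + machine_change) / (1 + human_change)) ** years

for n in (1, 3, 5):
    print(n, round(relative_machine_cost(n), 3))
```

Under these assumptions the ratio falls to two thirds of its previous value every year (0.8 / 1.2), so a task that costs the same by machine or by hand today would cost the machine roughly one eighth as much as the person after five years.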

The question of the effectiveness of computer processing versus human processing
on a strict performance level also needs careful consideration.
There is a very human tendency to presume that a computer is unable to perform
certain tasks "which everyone knows only humans have the ability to perform."
All claims that humans can perform given tasks better or more cost-effectively
than computers, or vice versa, should be considered suspect without specific
scientific evidence. In general, one can say that most claims for humans
being better than computers at document retrieval tasks are simply unproven;
the reverse is also frequently the case for certain conditions. Informed
decisions in specific instances appear to require explicit measurement of
the cost and effectiveness of all options; there are few, if any, "well accepted
principles" in the area of document retrieval.

2.2 Current NASA Scientific and Technical Information Facility System

2.2.1 Missions 


The Facility has the following primary missions: 


Acquisition and processing of the world's aerospace-related report
literature.

Processing of the world's aerospace-related open literature acquired
from the American Institute of Aeronautics and Astronautics (AIAA).

Announcement of current or selected literature to the aerospace community
via indexed abstract journals (i.e., Scientific and Technical Aerospace
Reports, commonly called STAR) and also via selective dissemination
of information (SDI) techniques (i.e., Selected Current Aerospace
Notices, commonly called SCAN).

Providing initial distribution automatically of full documents on
microfiche available in the collection to qualified recipients and
secondary distribution of the microfiche or blowbacks of the documents
to others upon authorized requests.

Providing bibliographic research services to qualified individuals or
organizations who either request these services through the Facility
or have online interactive access through terminals using NASA's REmote
CONsole (RECON) System. RECON is a derivative of Lockheed's DIALOG
System.

Providing retrospective search services, and bibliographies resulting
therefrom, to qualified recipients.

Providing library support services and products for NASA and NASA
affiliated libraries for cost effective functions such as: online
research through RECON; interlibrary network loan of books and periodicals;
and preparation of catalog cards, book catalogs, shelf lists,
acquisition lists, etc.

Providing generalized support services for the dissemination of aero- 
nautical and aerospace information to the public at large through 
technology utilization programs. 


2.2.2 Indexing 


Other than those activities pertaining to the generation and distribution
of microfiche and miscellaneous support services, all of these missions
are associated in one way or another with the development or use of indexes.
From the time the NASA Facility began operation early in 1962 until the end
of 1967, an indexing philosophy closely related to that of the "Uniterm"
system was employed. Adjectival word-forms were permitted as index terms.
The indexing was at first essentially "free" of any constraints, although
in later years more and more dependence upon a published guide to subject
indexes was required.

In 1966, NASA determined that a change in philosophy was in order, so that
system performance could be improved. NASA elected to prepare a thesaurus
of aerospace terminology to be used as a vocabulary control authority.
This NASA Thesaurus (N 1 ) was prepared during the latter part of 1966 and 1967.
As a base for this vocabulary, NASA adopted terminological conventions previously
developed by the Engineers Joint Council and the Department of Defense.

The NASA Thesaurus was first published in December 1967. Beginning with
the accessions scheduled for announcement with the first issues of the 1968
volumes of the Scientific and Technical Aerospace Reports (STAR) and
International Aerospace Abstracts (IAA), the NASA Thesaurus was used as
an indexing vocabulary control tool and has been used as such until the
present time. The NASA Thesaurus is a dynamic publication, in that it is
updated periodically, with the latest publication occurring in 1976.

At that time, there were 15,060 postable terms and 3,343 nonpostable terms;
however, with pseudoterms and other entry terms which provide multiple
access to the NASA Thesaurus, there were 35,801 entry points.

2.2.3 Machine-readable Abstracts


In 1971, another major change in processing techniques was made at
the NASA Facility with the implementation of an online input
capability. This change was accomplished in conjunction with a change
in hardware capability from one that supported the IBM 1410 (in emulation) to
an IBM 360 Operating System mode, which is still in place. With
this processing change, natural language abstracts in machine-readable
form were saved and have subsequently been a part of the NASA Facility data
bases. Abstracts are now routinely incorporated into the data bases.

Abstracts for which NASA Thesaurus indexing is accomplished are available
online as follows:

1. Scientific and Technical Aerospace Reports (STAR), November 1971 to
the present time.

2. International Aerospace Abstracts (IAA), January 1972 to the present time.

3. Computer Program Abstracts (CPA), January 1973 to the present time.

4. Classified Scientific and Technical Aerospace Reports (CSTAR), January 1975 to
the present time (however, no classified data is available in the data base).

2.2.4 Searching 


The abstracts which are available are all text searchable with the NASA
RECON system, and the titles for all citations in these four files are
also text searchable. Text searching in the NASA system is based on a
hierarchy of document records, sentences within the records and words within 
the sentences. 

Searching in the NASA system can also be accomplished through the use of 
Personal Authors, Corporate Authors, NASA Thesaurus Subject Terms, Report 
Numbers, Contract Numbers, etc., as entry points. The NASA Thesaurus 
Subject Terms are generated by human indexers who determine six to twelve 
of the important ideas and concepts in a document and select the most
descriptive terms from the NASA Thesaurus.

Major subject terms, selected to reflect the major concepts and research 
areas, are printed in the abstract journal in a subject index. To the
user of the index, the published terms in combination with the title (or 
added title information, called a Title Extension) should permit a quick 
review of available material and assist in determining if a document is of 
further interest. Minor concepts are indexed for document retrieval only. 


3. Central Algorithm

3.1 A Procedure for Selecting Print Terms from an Abstract 

The primary problem for NASA, in moving toward a completely natural language
abstract based system, is the need to prepare printed indexes. (There is
no reason to presume that such a shift will do anything negative for on-line
retrieval, and it is very likely to do many things positive.) A consensus
of expert technical opinion is that satisfactory print terms can be
generated in a primarily automatic manner.

This section of this paper describes, in two parts, a suggested technique,
NASA TERMS, which generates print terms. First, the idea of a thesaurus is
presented. Then a procedure is presented whereby an abstract can be processed
"against" a thesaurus of the type described to produce a set of print
terms.

3.2 A Print Term Generation Thesaurus 


For maximal cost-effectiveness, on-line searchers require a thesaurus for 
the same reason indexers do — human memory is fallible. Stored lists of 
related terms help the searcher specify synonyms and more accurately des- 
cribe the subject in which he is interested. Since the printed index user 
does not have an on-line thesaurus to aid him, the producers of the index 
must map synonyms to a single posting term (and enable the user to determine 
the posting term). The computer is also capable of cost-effectively utilizing 
as many descriptive terms for a document as can be provided; printed indices 
are limited by economic considerations to k to S postings per document. 

The proposed Print Term Selection Thesaurus is essentially the present NASA
Thesaurus extended to include all of the "mappings" presently applied by
the indexers "automatically," extended to specify the "worth" of each
posting term in the thesaurus, and extended to indicate the areas of use
of each Thesaurus Entry Phrase (TEP).

The additional mappings required are what are normally known as entry postings,
"see" postings or "use" postings. What is desired is to codify the rules
used by indexers for selecting specific print terms. Initially, the thesaurus
will need to be "educated." This can be done in several ways. Of the two
most plausible, one is to add one or more existing thesauri which represent
generally available vocabularies and contain special features worth
having. The other is to process existing abstracts for which manually selected
print terms are available. While the goal for the computer system is not to
generate exactly the print terms manually specified, it is reasonable to expect
the computer to generate most such print terms, a few others, and terms
more specific or more general than the manual choice. Where differences
exist, lexicographers, or the original indexers, are very likely to be able
to specify a "path" by which the computer could determine that a given manually
specified print term is useful, if such a choice is indeed warranted.

The worth of a print term is a tool to enable the computer to determine which
of several terms in a hierarchy is desirable (as described in the next section).
For example, print terms which occur frequently in STAR are likely
to be of less use than more specific terms. In any hierarchy, if several
terms of varying breadth are plausible, only the most specific is normally
selected as the print term. Similarly, if several "brother" terms are
indicated, only the general term which includes all of them is usually chosen
as a print term.


A similar problem occurs with ambiguous terms, e.g., sizing (shaping) and
sizing (surface treatment). In order to automatically disambiguate such
terms, the thesaurus needs to include rules for selecting one form of a term
rather than the other. Non-ambiguous terms have known usage patterns. The
area specified by the majority of the non-ambiguous terms can be used to
select the proper posting for an ambiguous term.


3.3 Types of Words


Although one normally thinks of a thesaurus as a place to find the proper
posting term for a given entry term, the NASA TERMS thesaurus also contains
indications for two classes of words. In general, words can be divided into
three types: (i) those generally useful for retrieval; (ii) those occasionally
useful directly or useful for determining proper retrieval phrases;
and (iii) those never useful for retrieval.


Words never useful for retrieval are functor words such as articles, most 
prepositions, conjunctions, and all forms of the verbs "to be," "to have," 
etc. These words typically separate the content phrases that are useful 
for retrieval. There are not too many of these words; most systems declare 
under 200 words to be of this type. Most of these words are also very fre- 
quently used, consisting of upwards of 50% of running text. Thus, it is 
useful to store these words in the main memory of the computer when pro- 
cessing text. 
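
A minimal Python sketch of this filtering follows. The functor list here is a tiny hypothetical sample (a working table would hold the full set of under 200 words), and "and" and "of" are deliberately left out because they belong to the occasionally useful class discussed next.

```python
import re

# Tiny illustrative functor table, small enough to stay in main memory.
FUNCTORS = {"a", "an", "the", "in", "on", "for", "with", "by", "to", "from",
            "is", "are", "was", "were", "be", "been", "has", "have", "had",
            "it", "this", "that", "which"}

def split_into_phrases(text):
    """Split text into content phrases at sentence boundaries and functor words."""
    phrases = []
    for sentence in re.split(r"[.;:!?]+", text.lower()):
        current = []
        for word in re.findall(r"[a-z]+", sentence):
            if word in FUNCTORS:
                if current:                      # a functor closes the open phrase
                    phrases.append(" ".join(current))
                current = []
            else:
                current.append(word)
        if current:                              # sentence end closes it as well
            phrases.append(" ".join(current))
    return phrases

print(split_into_phrases(
    "The computer is used for the selection of print terms in an abstract journal."))
```

Because the functor words act only as separators, what remains between them is the set of candidate content phrases to be looked up in the thesaurus.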

Words occasionally useful are of three subtypes: functors, ambiguous, and
qualifying. Functors like "and" and "of" are of no retrieval use in themselves.
However, they must be considered correctly in order to properly
handle contextually adjacent useful terms.

Occurrences of "and" will be deleted, obviously. Prior to deletion, however, 
surrounding phrases must be fully expanded. That is, constructs of the form 
"adjective noun and noun" must be rephrased as "adjective noun and adjective 
noun." (It is worth mentioning that this rephrasing will occasionally re- 
sult in an error, i.e., the adjective does not have to modify the second 
noun.) Experience has shown that rephrasing leads to fewer errors. 
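
The expansion can be sketched in Python as follows. The part-of-speech table is a hypothetical stand-in for information that, in the design described here, would come from the thesaurus; only the four-word "adjective noun and noun" pattern is handled.

```python
# Hypothetical part-of-speech lookup; the thesaurus would supply this in practice.
PART_OF_SPEECH = {"thermal": "adj", "stress": "noun", "strain": "noun",
                  "inlets": "noun", "nozzles": "noun"}

def expand_and(phrase):
    """Rewrite 'adjective noun and noun' as two phrases sharing the adjective."""
    words = phrase.lower().split()
    if len(words) == 4 and words[2] == "and":
        tags = [PART_OF_SPEECH.get(w) for w in (words[0], words[1], words[3])]
        if tags == ["adj", "noun", "noun"]:
            return [f"{words[0]} {words[1]}", f"{words[0]} {words[3]}"]
    # Otherwise drop "and" and treat each side as its own phrase.
    parts = [p.strip() for p in phrase.lower().split(" and ")]
    return [p for p in parts if p]

print(expand_and("thermal stress and strain"))
print(expand_and("inlets and nozzles"))
```

As the text notes, the expansion is occasionally wrong: nothing in this sketch checks that the adjective really modifies the second noun.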


Occurrences of "of" require special consideration so that phrases like 
"retrieval of information" and "angle of attack" are recognized as the same 
as the phrases "information retrieval" and "attack angle." 
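
One simple way to obtain this equivalence, sketched in Python under the assumption that a phrase contains a single "of", is to move the words after "of" in front of the words before it. The function name is hypothetical.

```python
def normalize_of(phrase):
    """Rewrite 'X of Y' as 'Y X' so both word orders reach the same entry.

    Phrases with no "of", or with more than one, are returned unchanged.
    """
    words = phrase.lower().split()
    if words.count("of") == 1:
        i = words.index("of")
        head, modifier = words[:i], words[i + 1:]
        if head and modifier:
            return " ".join(modifier + head)
    return " ".join(words)

print(normalize_of("retrieval of information"))
print(normalize_of("angle of attack"))
```

Phrases with several occurrences of "of" are left alone in this sketch, since the correct regrouping is not obvious without further rules.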


Certain words are ambiguous, with one usage in the functor category and another
usage useful for retrieval. The word "basic," as in the phrase "The basic
use of ..." is ignorable. The chemical concept "basic" is not ignorable.
The fact that a word is ambiguous in this way is decided manually. "Basic"
can be retained if a significant number of the other words in nearby sentences
have a chemical usage. This obviously leads to an error when "basic"
is used in the general sense in text in a chemical area; e.g., consider the
following sentence: The basic way to neutralize organic acids is with organic
bases. (Such anomalies are easy to construct but rarely occur in
practice.)

Qualifying words are not ambiguous in that they are used in a single sense.
However, "high energy physics" is a phrase in which "high" must be retained;
in most phrases, such as "in a high tail," "high" is not needed.

Words of this middle class must be kept and used when appropriate but can 
be ignored when not in one of their special situations. Generally, such 
words should not be used to divide phrases, nor should they be used to try 
to find phrases. For example, it is more efficient to look up "energ" (the
stem of "energy") to find "high energy physics" than to process all words
following all occurrences of "high" for strings such as "energy physics."

Words not in the two preceding classes should be mappable into a valid thesaurus
entry phrase. Any word not in the thesaurus is automatically listed
for manual review. In addition, any phrase containing words which are present 
in other phrases but not in any phrase found in the thesaurus should be 
posted to a "strange usage" file for manual review. 

Proximity requirements could also be used in the thesaurus, e.g., if "variable" 
or "changeable" occurs in the same sentence with the term "sweep" and the 
term "wings," post the term "variable sweep wings."

Use of stems can drastically shrink the size of a thesaurus; this is important
more for human understanding of the thesaurus than for the reduction in
computer storage possible. Normally, a stem can be given with no restriction
on the possible suffixes; occasionally it may prove necessary to require
one of several standard sets of suffixes or even one of a specific list of suffixes.

As an example, almost any occurrence of some form of "varying," e.g., "variable" or
"varies," in the same context with "sweep wing" or "sweep wings" can be posted as
"variable sweep wings." (Note that
improper phrases can be constructed, e.g., "sweep wings cause varying turbulence
patterns as speed increases." Such constructs are, fortunately,
rare in actual text. How rare they are in NASA abstracts should be experimentally
determined.)
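
A same-sentence proximity rule of this kind can be sketched in Python. The rule format and the stems chosen are assumptions made for illustration; stems are matched as simple prefixes, i.e., with no restriction on the suffix.

```python
import re

# One hypothetical thesaurus rule: stems that must co-occur, and the posting term.
RULES = [
    {"any_of": ("vari", "vary", "changeab"), "all_of": ("sweep", "wing"),
     "post": "variable sweep wings"},
]

def has_stem(words, stem):
    """True if any word begins with the stem (no restriction on the suffix)."""
    return any(w.startswith(stem) for w in words)

def proximity_postings(text):
    """Return posting terms whose required stems co-occur within one sentence."""
    postings = []
    for sentence in re.split(r"[.!?]+", text.lower()):
        words = re.findall(r"[a-z]+", sentence)
        for rule in RULES:
            if (any(has_stem(words, s) for s in rule["any_of"])
                    and all(has_stem(words, s) for s in rule["all_of"])
                    and rule["post"] not in postings):
                postings.append(rule["post"])
    return postings

print(proximity_postings("The aircraft uses wings of changeable sweep."))
print(proximity_postings("Sweep wings cause varying turbulence patterns as speed increases."))
print(proximity_postings("Fixed wings were tested. The sweep was variable."))
```

The second call reproduces the improper posting described above, and an unrestricted prefix such as "vari" would also match words like "various"; this is the situation in which a specific list of permitted suffixes becomes necessary.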

3.4 The Basic NASA TERMS Algorithm 

Table 3-1 presents the steps included in the NASA TERMS algorithm. Many
are steps required in the present manual indexing system and will be retained
essentially unchanged in an automated term selection system. The
following paragraphs highlight areas of the algorithm which may be unclear
from the terse phraseology of the table. (It should be clearly understood
that the table and what follows are not considered to be a systems design;
rather, they are just an outline. A detailed specification of the algorithm is
a logical next step.)

Table 3-1

NASA Automatic Subject Analysis Technique for Extracting Retrievable Multi-terms
(NASA TERM System)

A System Which Generates Print Terms from Natural Text

1. Catalog document (as for normal system).

2. Keyboard document surrogate, i.e., title, abstract (if present), and other important
descriptive text, e.g., section headings, figures and table titles (between
200 and 500 words).

3. Isolate words in surrogate; convert variable length words into a fixed length (32
bit) number for the stem and a fixed length (8 bit) number for its suffix.

4. Based on a core-resident table, identify high-frequency, non-useful functor words.

5. Split surrogate into phrases based on sentence boundaries and high-frequency,
unambiguous functor words, i.e., articles, prepositions, pronouns, and conjunctions.

6. Look up the first word of every phrase in thesaurus yielding part of speech for
word and list of all entry phrases in the thesaurus which begin or include given
word. (This step frequently results in the replacement of a word by the stem of
that word.) Continue as needed to locate list treatment of all non-ignorable words.
Selection of the best phrase for a given surrogate word may require contextual
judgements.

7. The data from Step 6 is used to specify the majority of candidate print phrases
(CPP) for the surrogate. Note that the conjunctions located in Step 4 must be
considered to resolve typical adjective-"and"-adjective noun phrases, etc.

8. CPPs are ranked by the manually assigned weight obtained in Step 6 from the thesaurus.
Typical weights are 50 for very high quality (specific) terms, 20 for
average terms, 9 for single word terms and 4 for general single word terms. Terms
found in titles are doubled in weight.

9. CPPs which occur more than once are replaced by a single CPP with a weight equal
to the sum of the individual weights.

10. The weight of each CPP is boosted by the proportionate weight of any other CPP
which contains any word stem(s) in common with the given ranking CPP.

11. The CPPs are ranked and the top four CPPs are used as print terms. The fifth
CPP is also used if the difference in weight of that CPP and the next CPP exceeds
the difference in weight of that CPP and the preceding CPP. Similarly for subsequent
CPPs. No CPP is used if its weight is less than 30, and any CPP with
a weight over 49 will be used. An attempt will be made to limit the number of
CPPs to six. If two selected terms have a broader-narrower relationship in a
thesaurus hierarchy, only the narrower is used as a print term unless the weight
of the broader term exceeds 150% of the weight of the narrower term.

12. The document surrogate is added to the online file, be it clustered or inverted.
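
Steps 8 through 11 can be sketched in Python as follows. This is one possible reading of the table, not a specification: the boost formula (each other CPP contributes its weight times the fraction of its stems that are shared), the hard cap of six, and the treatment of the last ranked CPP are all assumptions, and the phrases and hierarchy are invented. Phrases are written as space-separated stems.

```python
def merge_duplicates(cpps):
    """Steps 8-9: double title weights, then sum the weights of repeated CPPs."""
    merged = {}
    for phrase, weight, in_title in cpps:
        merged[phrase] = merged.get(phrase, 0) + (2 * weight if in_title else weight)
    return merged

def boost(merged):
    """Step 10 (assumed formula): add, for every other CPP sharing a stem,
    that CPP's weight times the fraction of its stems that are shared."""
    boosted = {}
    for phrase, weight in merged.items():
        stems = set(phrase.split())
        extra = 0.0
        for other, other_weight in merged.items():
            if other != phrase:
                other_stems = set(other.split())
                extra += other_weight * len(stems & other_stems) / len(other_stems)
        boosted[phrase] = weight + extra
    return boosted

def select_print_terms(boosted, floor=30, sure=49, base=4, cap=6):
    """Step 11 without the hierarchy check; the cap of six is treated as hard."""
    ranked = sorted(boosted.items(), key=lambda kv: kv[1], reverse=True)
    chosen = []
    for i, (phrase, weight) in enumerate(ranked):
        if weight < floor or len(chosen) >= cap:
            break
        if i >= base and weight <= sure:
            prev_gap = ranked[i - 1][1] - weight
            next_weight = ranked[i + 1][1] if i + 1 < len(ranked) else 0
            if weight - next_weight <= prev_gap:   # gap rule for fifth and later CPPs
                break
        chosen.append(phrase)
    return chosen

def drop_broader(chosen, boosted, broader_of):
    """Drop a broader term unless it exceeds 150% of its narrower term's weight."""
    dropped = {broader_of[n] for n in chosen
               if broader_of.get(n) in chosen
               and boosted[broader_of[n]] <= 1.5 * boosted[n]}
    return [p for p in chosen if p not in dropped]

# Invented candidates: (phrase, thesaurus weight, found in title?).
cpps = [("variable sweep wing", 50, False), ("wing root", 20, True),
        ("torsional stress", 20, False), ("wing", 9, False),
        ("wing", 9, False), ("stress", 4, False)]
broader_of = {"variable sweep wing": "wing", "wing root": "wing"}  # hypothetical hierarchy

boosted = boost(merge_duplicates(cpps))
chosen = select_print_terms(boosted)
print(drop_broader(chosen, boosted, broader_of))
```

In this example the repeated single word "wing" gains enough weight to pass the threshold of 30, but it is then dropped in favor of the narrower terms that contain it.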
If the abstract is not long enough for satisfactory processing, say 200
words, the leading and/or terminating paragraphs of the document can be
entered. It would probably be worthwhile to enter all section headings,
illustration and table titles, etc., as these items would increase recall
for many queries; even so, the number of words entered is not excessive
with respect to the payoff, and the words can be quickly selected by clerical
activity.

Common misspellings could be entered into the thesaurus and automatically
corrected. Normally an error and activity report of automatically applied
corrections would be prepared for a lexicographer to confirm that any given
"spelling correction" is proper and not some unanticipated valid word or an
unanticipated spelling error for which the wrong correction was supplied.
Additionally, spelling correction algorithms could be employed.
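
A table-driven correction with its activity report might look like the following Python sketch. The correction table, the accession identifier, and the report layout are all hypothetical.

```python
# Hypothetical table of known misspellings and their corrections.
CORRECTIONS = {"turbulance": "turbulence", "anomolies": "anomalies"}

def correct_spelling(words, accession):
    """Apply known corrections and log each one for lexicographer review."""
    corrected, report = [], []
    for position, word in enumerate(words):
        fixed = CORRECTIONS.get(word.lower())
        if fixed is not None:
            # Record where the change was made so it can be confirmed or undone.
            report.append((accession, position, word, fixed))
            corrected.append(fixed)
        else:
            corrected.append(word)
    return corrected, report

words, report = correct_spelling(["Turbulance", "patterns"], "A-0001")
print(words)
print(report)
```

Keeping the original spelling in the report lets the lexicographer spot cases where the "misspelling" was in fact a valid word.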

Through the use of the number corresponding to a given stem, the set of 
phrases in which each word can appear is obtained from the thesaurus. 
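
The stem and suffix numbers of Step 3, and the lookup from a stem number to entry phrases, can be sketched in Python. The suffix list, the crude longest-suffix stemming rule, and the class and function names are assumptions for illustration; the entry phrases are the "root" entries cited later in this paper.

```python
SUFFIXES = ["", "s", "es", "ing", "ed", "al", "y"]   # hypothetical suffix table

class StemTable:
    """Assigns each distinct stem a number, in order of first appearance."""
    def __init__(self):
        self.stem_ids = {}

    def encode(self, word):
        """Return (stem number, suffix number) using the longest matching suffix."""
        word = word.lower()
        suffix = max((s for s in SUFFIXES
                      if s and word.endswith(s) and len(word) - len(s) >= 3),
                     key=len, default="")
        stem = word[:len(word) - len(suffix)]
        stem_id = self.stem_ids.setdefault(stem, len(self.stem_ids))
        return stem_id, SUFFIXES.index(suffix)

def pack(stem_id, suffix_id):
    """Combine a 32-bit stem number and an 8-bit suffix number into one integer."""
    assert stem_id < 2**32 and suffix_id < 2**8
    return (stem_id << 8) | suffix_id

def build_phrase_index(table, entry_phrases):
    """Map each stem number to the entry phrases in which that stem appears."""
    index = {}
    for phrase in entry_phrases:
        for word in phrase.split():
            stem_id, _ = table.encode(word)
            index.setdefault(stem_id, []).append(phrase)
    return index

ENTRY_PHRASES = ["root mean square error", "roots", "roots of equations",
                 "plant roots", "wing roots"]
table = StemTable()
index = build_phrase_index(table, ENTRY_PHRASES)
root_id, _ = table.encode("root")
print(index[root_id])
```

Because "root" and "roots" share a stem number and differ only in the suffix number, a single lookup retrieves every entry phrase in which either form can appear.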

The computer will determine any phrases present in the document which cor- 
respond to requirements specified in the thesaurus. For each such phrase, 
the thesaurus print term which corresponds is added to the document record. 
Words of some importance for which no satisfactory entry phrase can be 
determined are included in an "exceptions report" for manual processing. 

It is important to realize the necessity of including some manual processing.
Words which are new must be added to the thesaurus. Words which have occurred
in phrases not anticipated in the thesaurus should be looked at again. Unusual
documents should be examined. All of these actions should result in improve- 
ments in the system. Such improvements (i.e., additions) will be very frequent 
when the system is young. The number of improvements will decrease with 
time but never to the zero level. It is anticipated that new terminology 
and typographical problems (not all of which are simple keyboard errors) 
will keep several lexicographers and indexers busy full-time at the Facility. 
The main intent of automatic processing is to record every manual decision 
when that decision is made so that there is rarely the need to make the same 
decision repeatedly. 

In order to prove that such a process can generate suitable print terms,
several experiments are recommended. As each is "passed," it will be reasonable
to incur the cost of the next. It would be unwise to invest time
and money in advanced, complex experiments until all parties are convinced
that the result is likely to be an acceptable indication of usefulness or
non-usefulness of the technique under test.

Phrases which contain word roots that appear in other phrases are additionally 
boosted by the proportionate weight of those other roots. This technique 
will highlight long, and thus highly specific, phrases that are not likely 
to be repeated in an abstract. Where two phrases, typically single word 
phrases, differ only in suffix, the more frequently occurring form is used. 

3.5 Thesaurus Entry Phrases Have Normal "Areas of Use" 


All NASA abstracts are published in one of /b categories. The NASA Thesaurus 
contains about 2,000 distinct hierarchies and most entry phrases are present 
in a (single) hierarchy. These two ideas can be used to define an 
"area of use" for each entry phrase. The proper category for a phrase can be 
determined by "looking" at the category used for abstracts for which that 
phrase is a descriptor. Naturally this process is a one-time batch run which 
would augment the record for each phrase with the number of times that phrase 
was used in each area. Most phrases are used exclusively within three or four 
categories and any phrase which was used to describe only one or two abstracts 
in a given category can be dropped from that category. With these "areas 
of use" known, ambiguous entry phrases can be properly classified (based on 
the consensus of the "areas of use" of unambiguous phrases). This technique 
might also allow selection of a more specific phrase such as "wing root", 
rather than the general term "root" when a surrogate uses only the term 
"root" at some point. 

The mapping of "root torsion" into "wing roots" (and "torsional stress") 
in the first example (Exhibit A) will strike many as far-fetched for an 
automatic system. However, the mapping is straightforward when one considers 
what should be done with the word "root." In the thesaurus, it 
appears as: root mean square error, roots, roots of equations, plant roots, 
and wing roots. If one assigns entries to general categories, i.e., "most 
likely to be used in area...," then the above entries, except for the ambiguous 
"roots," have definite areas. The use in this sentence is from the same 
subject area as "wing roots"; hence, "wing roots" is the proper index point. 
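The "area of use" reasoning for the word "root" can be sketched as below. The category names, the usage counts, and the function names are all hypothetical; only the rule of dropping a category in which a phrase described one or two abstracts, and the idea of a consensus among unambiguous phrases, come from the text. Summing usage counts is one assumed way of forming that consensus.

```python
from collections import Counter

# Hypothetical "areas of use": entry phrase -> {category: abstracts indexed}.
AREAS_OF_USE = {
    "wing roots": {"aerodynamics": 40, "aircraft design": 12},
    "plant roots": {"life sciences": 25},
    "roots of equations": {"numerical analysis": 30},
    "rotary wings": {"aerodynamics": 55, "aircraft design": 20},
    "torsional stress": {"structural mechanics": 35, "aerodynamics": 9},
}


def drop_rare(areas, min_uses=3):
    """Drop categories in which the phrase described only one or two abstracts."""
    return {cat: n for cat, n in areas.items() if n >= min_uses}


def consensus_area(unambiguous_terms, areas_of_use=AREAS_OF_USE):
    """Pick the category most used by the document's unambiguous terms."""
    votes = Counter()
    for term in unambiguous_terms:
        for cat, n in drop_rare(areas_of_use.get(term, {})).items():
            votes[cat] += n
    return votes.most_common(1)[0][0] if votes else None


def resolve(candidates, unambiguous_terms, areas_of_use=AREAS_OF_USE):
    """Choose the candidate phrase whose area of use matches the consensus."""
    if not candidates:
        return None
    area = consensus_area(unambiguous_terms, areas_of_use)
    scored = [(areas_of_use.get(c, {}).get(area, 0), c) for c in candidates]
    best_count, best = max(scored)
    return best if best_count > 0 else None
```

With "rotary wings" and "torsional stress" already assigned to a document, the candidates "wing roots", "plant roots", and "roots of equations" resolve to "wing roots" under these assumed counts; when no candidate shares the consensus area, the sketch returns nothing and the word would go to manual review.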

Examples of NASA TERMS for Automatically Generating Print Terms 

This section shows the way a computer could generate print terms for several 
abstracts. The process is manually done, i.e., it has not yet been programmed. 
However, every attempt has been made to be fair; no decision has been made 
that is not believed to be easily programmed. The computer selected print 
terms were obtained without knowledge of the NASA expert's choices, which 
are included for comparison. Please note that duplication of the expert's choices 
is not a goal; an equally useful set for STAR is the goal. 


Four simulations are presented. The first two are doctoral dissertation 
abstracts from STAR. The third is an author written report abstract from 
STAR. The fourth is an AIAA prepared abstract from IAA. Each simulation 
is presented as one exhibit in several parts. The first part shows the 
surrogate as published. The second part shows the RECON display for the 
surrogate. The third part of each exhibit shows the words in each title 
and abstract which are totally ignorable (underlined) or ignorable except 
when used in entry phrases. These words and sentence ends separate surrogate 
entry phrases (SEPs). The fourth part of each exhibit shows (in varying 
degrees) how each SEP is processed. At a minimum, each candidate print 
phrase (CPP) selected is shown with the weight assigned that SEP. In many 
cases, the reasoning behind the selection is indicated. Any SEPs the system 
cannot handle well are noted and would appear on a list of such SEPs for 
manual review. The fifth part of each exhibit lists the CPPs in weight order, 
showing all components of the total weight and showing which CPPs are actually 
selected as print terms by the NASA TERMS algorithm. The sixth part 
of each exhibit lists the manually chosen print and non-print terms. Terms 
selected or identified by the NASA TERMS and those produced manually are 
so identified. Other terms are commented upon. Please note again that 
duplication of manual choices is not a goal; only the selection of terms 
equally (or more) useful for retrieval is a reasonable goal. 
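The phrase isolation shown in the third part of each exhibit can be sketched as follows. The report does not enumerate the ignorable words, so both word lists and the protected phrase list here are hypothetical, as is the function name split_seps; the sketch only illustrates how ignorable words and sentence ends separate SEPs while a word such as "of" survives inside a known entry phrase.

```python
import re

# Hypothetical word lists; the report does not enumerate them.
TOTALLY_IGNORABLE = {"a", "an", "the", "is", "are", "to", "on", "by", "with", "which"}
IGNORABLE_OUTSIDE_PHRASES = {"of", "and", "in", "for"}
# Hypothetical entry phrases that contain an otherwise ignorable word.
PHRASES_WITH_IGNORABLES = ["equations of motion"]


def split_seps(text):
    """Split text into surrogate entry phrases (SEPs)."""
    text = text.lower()
    # Protect known entry phrases so their inner words do not split them.
    for phrase in PHRASES_WITH_IGNORABLES:
        text = text.replace(phrase, phrase.replace(" ", "_"))
    seps = []
    # Sentence ends also separate SEPs (decimal numbers are not handled here).
    for sentence in re.split(r"[.;:!?]+", text):
        current = []
        for word in re.findall(r"[a-z0-9_]+", sentence):
            if word in TOTALLY_IGNORABLE or word in IGNORABLE_OUTSIDE_PHRASES:
                if current:
                    seps.append(" ".join(current))
                    current = []
            else:
                current.append(word.replace("_", " "))
        if current:
            seps.append(" ".join(current))
    return seps
```

Applied to a sentence from the second abstract, the sketch keeps "equations of motion" intact and isolates "coupled nonlinear partial differential equations" as one SEP; words such as "represented" remain as single-word SEPs and would be weighed or discarded by later steps.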


N77-29090 Georgia Inst. of Tech., Atlanta. 
A METHOD OF COMPUTING THE POTENTIAL FLOW ON 
THICK WING TIPS Ph.D. Thesis 
Pradeep Rao 1976 174 p 
Avail: Univ. Microfilms Order No. 77-7352 

An iterative procedure to compute detailed velocity and 
pressure distributions on the surface of thick wing tips is developed 
using potential flow theory. The method uses a two dimensional 
surface vorticity distribution as an initial approximation. Therefore, 
the two dimensional problem is first formulated in the form of 
an integral equation using vorticity as the surface singularity 
which is solved by the elementary vortex distribution technique. 
A comparison of the flow computed on a circular cylinder with 
the exact analytical results provides a measure of accuracy. The 
two dimensional noncirculatory and circulatory flow is computed 
for NACA basic thickness form airfoils. Dissert. Abstr. 


First Surrogate as Published (STAR) 
Exhibit A-1 


77N29090 ISSUE 20 PAGE 2621 CATEGORY 2 76/00/00 174 PAGES 
UNCLASSIFIED DOCUMENT 

A METHOD OF COMPUTING THE POTENTIAL FLOW ON THICK WING TIPS PH.D. 
THESIS 

A/RAO, P. 

GEORGIA INST. OF TECH., ATLANTA. AVAIL UNIV. MICROFILMS ORDER 
NO. 77-7352 

/*POTENTIAL FLOW/*PRESSURE DISTRIBUTION/*VELOCITY MEASUREMENT/*WING 
TIPS/ AIRFOILS/ FLOW THEORY/ TWO DIMENSIONAL FLOW/ VORTICITY 

ABA DISSERT. ABSTR. 

ABS AN ITERATIVE PROCEDURE TO COMPUTE DETAILED VELOCITY AND PRESSURE 
DISTRIBUTIONS ON THE SURFACE OF THICK WING TIPS IS DEVELOPED USING 
POTENTIAL FLOW THEORY. THE METHOD USES A TWO DIMENSIONAL SURFACE 
VORTICITY DISTRIBUTION AS AN INITIAL APPROXIMATION. THEREFORE, THE TWO 
DIMENSIONAL PROBLEM IS FIRST FORMULATED IN THE FORM OF AN INTEGRAL 
EQUATION USING VORTICITY AS THE SURFACE SINGULARITY WHICH IS SOLVED BY 
THE ELEMENTARY VORTEX DISTRIBUTION TECHNIQUE. A COMPARISON OF THE FLOW 
COMPUTED ON A CIRCULAR CYLINDER WITH THE EXACT ANALYTICAL RESULTS 
PROVIDES A MEASURE OF ACCURACY. THE TWO DIMENSIONAL NONCIRCULATORY AND 
CIRCULATORY FLOW IS COMPUTED FOR NACA BASIC THICKNESS FORM 
AIRFOILS. 


RECON Display of First Surrogate 


Exhibit A-2 



Jack Kirschbaum 



A METHOD OF COMPUTING THE POTENTIAL FLOW ON THICK WING TIPS 


An iterative procedure to compute detailed velocity and pressure 
distributions on the surface of thick wing tips is developed using potential flow 
theory. The method uses a two dimensional surface vorticity distribution as an 
initial approximation. Therefore, the two dimensional problem is first formulated 
in the form of an integral equation using vorticity as the surface singularity 
which is solved by the elementary vortex distribution technique. A comparison 
of the flow computed on a circular cylinder with the exact analytical results 
provides a measure of accuracy. The two dimensional noncirculatory and circulatory 
flow is computed for NACA basic thickness form airfoils. 

Dissert. Abstr. 

Phrase Isolation 
Exhibit A-3 


Text Phrase Reduction 
Exhibit A-4 


computing -> computation, weight 9 
potential flow -> potential flow, weight 20 
wing tips -> wing tips, weight 20 
iterative procedure -> iterative procedures, weight 20 
compute -> computation, weight 9 
velocity distributions -> velocity distribution, weight 20 
pressure distributions -> pressure distribution, weight 20 
wing tips -> wing tips, weight 20 
potential flow theory -> potential flow, weight 20 
two dimensional surface vorticity distribution -> two dimensional boundary layer 
    flow, weight 50, & vorticity, weight 9 
initial approximation -> initial approximations, weight 20 
two dimensional problem -> two dimensional, weight 9 
integral equation -> integral equations, weight 20 
vorticity -> vorticity, weight 9 
surface singularity -> surface properties, weight 9; singularity (mathematics), weight 9 
elementary vortex distribution technique -> vorticity, weight 20 
flow computed -> flow computation, weight 20 
circular cylinder -> circular cylinder, weight 20 
analytical results -> (phrase not found, posted for manual review) 
two dimensional noncirculatory and circulatory flow -> two dimensional flow, weight 20, 
    & noncirculatory flow, weight 20, & circulatory flow, weight 20 
NACA basic thickness form airfoils -> NACA, weight 20, & thickness form airfoils -> 
    airfoil profiles, weight 20 


NASA TERMS Selections 
Exhibit A-5 

                                      Occurrence          Co-Occurring 
Print Terms                           Weights      Sum    Boost          Total 

Two dimensional boundary layer flow   50            50    53             103 
Potential flow                        2*20,20       60    33              93 
Two dimensional flow                  20            20    51              71 
Flow computation                      20                  60              80 
*Circulatory flow                     20                  46              66 
*Noncirculatory flow                  20                  46              66 

Non-Print Terms 

Wing tips                             2*20,20       60                    60 
Vorticity (including vortices)        9,9,20        38                    38 
Computation                           2*9,9         27     7              34 
Two dimensional                       9                   22              31 
Velocity distributions                20                  10              30 
Pressure distribution                 20                  10              30 
Integral equations                    20                                  20 
Airfoil profiles                      20                                  20 
Initial approximations                20                                  20 
Iterative procedures                  20                                  20 
Circular cylinder                     20                                  20 
NACA                                  20                                  20 
Surface properties                    9                                    9 
Singularity (mathematics)             9                                    9 

* Dropped as print terms despite weight due to "broader/narrower" 
relationships. 
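The arithmetic behind Exhibit A-5 can be checked with a short sketch. It assumes that the "2*" entries mark title occurrences counted twice and that the total is the occurrence sum plus the co-occurring boost; the exhibit's figures agree with that reading, but the rule is inferred from the table rather than stated in the text. The function names are hypothetical, and the boost values are taken from the exhibit as given, not computed.

```python
def score(title_weights, abstract_weights, boost=0):
    """Return (occurrence sum, total) for one candidate print phrase.

    Assumption: title occurrences count double (the "2*" in the exhibit).
    """
    occurrence_sum = 2 * sum(title_weights) + sum(abstract_weights)
    return occurrence_sum, occurrence_sum + boost


def rank(candidates):
    """Order candidates by total weight, highest first.

    candidates maps term -> (title weights, abstract weights, boost).
    """
    rows = []
    for term, (title_w, abstract_w, boost) in candidates.items():
        occ, total = score(title_w, abstract_w, boost)
        rows.append((total, occ, term))
    rows.sort(key=lambda r: (-r[0], r[2]))
    return [(term, occ, total) for total, occ, term in rows]


# Figures taken from Exhibit A-5.
EXHIBIT_A5 = {
    "two dimensional boundary layer flow": ([], [50], 53),
    "potential flow": ([20], [20], 33),
    "wing tips": ([20], [20], 0),
    "computation": ([9], [9], 7),
}
```

Calling rank(EXHIBIT_A5) orders the four sample terms as they appear in the exhibit, with totals 103, 93, 60, and 34. Selecting print terms would additionally require a cutoff and the "broader/narrower" check noted in the exhibit, neither of which is modeled here.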



Manually Selected Terms 
Exhibit A-6 


Print Terms                 Comments 

Potential Flow              Common selection 
Pressure Distributions      Usefulness as a print term is questionable 
Velocity Measurements       Incorrect term - should be "Velocity Distribution" 
Wing Tips                   Common selection 


Search Terms 

Airfoils 
Flow Theory 
Two Dimensional Flow 
Vorticity 

Important and underrated by both techniques 
Too important not to be used as a print term 


Computer selected print terms not above 

Two dimensional boundary    Requires recognition of parallel way of 
layer flow                  describing. 


N77-29089 California Univ., Los Angeles. 
THE COUPLED FLAP-LAG-TORSIONAL AEROELASTIC 
STABILITY OF HELICOPTER ROTOR BLADES IN FORWARD 
FLIGHT Ph.D. Thesis 
Manuel Reyna Allende 1976 295 p 
Avail: Univ. Microfilms Order No. 77-8530 

A set of coupled flap lag torsional equations of motion capable 
of simulating general hingeless rotor blade configurations are 
derived for the case of a rotor blade having moderate deflections. 
The final equations of motion are represented by a system of 
coupled, nonlinear partial differential equations. The equations 
are capable of simulating rotor blades having: (1) precone; (2) 
droop; (3) built in twist; (4) distributed torsion; (5) root torsion 
(or pitch link flexibility); (6) blade root offsets; (7) and offsets 
between the elastic axis, aerodynamic center and the blade cross 
sectional center of mass. Quasisteady aerodynamic loads are 
used and the effects of stall and compressibility are neglected. 
Reversed flow is included in the representation of the airloads. 

Dissert. Abstr. 


Second Surrogate as Printed (STAR) 
Exhibit B-1 


77N29089 ISSUE 20 PAGE 2620 CATEGORY 2 76/00/00 295 PAGES 
UNCLASSIFIED DOCUMENT 

THE COUPLED FLAP-LAG-TORSIONAL AEROELASTIC STABILITY OF HELICOPTER 
ROTOR BLADES IN FORWARD FLIGHT 
PH.D. THESIS 
A/REYNA-ALLENDE, M. 

CALIFORNIA UNIV., LOS ANGELES. AVAIL UNIV. MICROFILMS ORDER NO. 
77-8530 

/*AERODYNAMIC STABILITY/*EQUATIONS OF MOTION/*HELICOPTERS/*ROTOR 
BLADES/ AERODYNAMIC LOADS/ AIRCRAFT CONFIGURATIONS/ NONLINEARITY/ 
PARTIAL DIFFERENTIAL EQUATIONS 
ABA DISSERT. ABSTR. 

ABS A SET OF COUPLED FLAP LAG TORSIONAL EQUATIONS OF MOTION CAPABLE 
OF SIMULATING GENERAL HINGELESS ROTOR BLADE CONFIGURATIONS ARE DERIVED 
FOR THE CASE OF A ROTOR BLADE HAVING MODERATE DEFLECTIONS. THE FINAL 
EQUATIONS OF MOTION ARE REPRESENTED BY A SYSTEM OF COUPLED, NONLINEAR 
PARTIAL DIFFERENTIAL EQUATIONS. THE EQUATIONS ARE CAPABLE OF SIMULATING 
ROTOR BLADES HAVING (1) PRECONE; (2) DROOP; (3) BUILT IN TWIST; (4) 
DISTRIBUTED TORSION; (5) ROOT TORSION (OR PITCH LINK FLEXIBILITY); (6) 
BLADE ROOT OFFSETS; (7) AND OFFSETS BETWEEN THE ELASTIC AXIS, 
AERODYNAMIC CENTER AND THE BLADE CROSS SECTIONAL CENTER OF MASS. 
QUASISTEADY AERODYNAMIC LOADS ARE USED AND THE EFFECTS OF STALL AND 
COMPRESSIBILITY ARE NEGLECTED. REVERSED FLOW IS INCLUDED IN THE 
REPRESENTATION OF THE AIRLOADS. 


RECON Display of Second Surrogate 
Exhibit B-2 

THE COUPLED FLAP-LAG-TORSIONAL AEROELASTIC STABILITY OF HELICOPTER ROTOR BLADES IN 
FORWARD FLIGHT 

A set of coupled flap lag torsional equations of motion capable of 
simulating general hingeless rotor blade configurations are derived for the 
case of a rotor blade having moderate deflections. The final equations of motion 
are represented by a system of coupled, nonlinear partial differential equations. 
The equations are capable of simulating rotor blades having: (1) precone; 
(2) droop; (3) built in twist; (4) distributed torsion; (5) root torsion (or pitch 
link flexibility); (6) blade root offsets; (7) and offsets between the elastic axis, 
aerodynamic center and the blade cross sectional center of mass. Quasisteady 
aerodynamic loads are used and the effects of stall and compressibility are 
neglected. Reversed flow is included in the representation of the airloads. 

Phrase Isolation 
Exhibit B-3 


Text Phrase Reduction 
Exhibit B-4 

Coupled flap lag torsional aeroelastic stability -> 
    (coupled ignored, posted for manual review) 
    flaps (control surfaces), weight 20 (lag not in thesaurus and ignored, posted for 
    manual review) 
    torsional stress, weight 20 
    aeroelasticity, weight 9 
    stability, weight 9 
Helicopter rotor blades -> rotary wings, weight 20 
Forward flight -> flight (forward dropped as unimportant), weight 9 
coupled flap lag torsional equations of motion -> 
    coupled flap lag torsional - see above 
    equations of motion, weight 20 
simulating general hingeless rotor blade configurations -> 
    simulation (general dropped), weight 20 
    rigid rotor, weight 20 
    rotary wings (configuration dropped), weight 20 
rotor blade -> rotary wings, weight 20 
deflections -> deflection, weight 9 
equations of motion -> equations of motion, weight 20 
coupled, nonlinear partial differential equations -> 
    nonlinear equations, weight 20 
    partial differential equations, weight 20 
equations -> equations, weight 9 
simulating rotor blades -> simulation, weight 20, rotary wings, weight 20 
(precone, droop and built in twist are all lost since not in Thesaurus), posted for 
    manual review 
distributed torsion -> torsional stress (distributed dropped), weight 20 
root torsion -> wing roots, weight 20; torsional stress, weight 20 
(pitch link flexibility) -> (too strong to ignore, human assistance requested) 
    pitch attitude control -> longitudinal control, weight 20 
blade root offsets -> rotary wings, weight 20, & wing roots (offsets dropped), weight 20 
elastic axis -> (too strong to ignore, human assistance requested) 
aerodynamic center -> aerodynamic configurations, weight 20 
blade cross sectional center of mass -> rotary wing, weight 20, & cross sections, 
    weight 9, & center of gravity, weight 20 
quasisteady aerodynamic loads -> aerodynamic loads (quasisteady dropped), weight 20 
stall -> stalling, weight 9 
compressibility -> compressibility, weight 4 
reversed flow -> reversed flow, weight 20 
airloads -> (new word, if second occurrence, request manual assistance) 



NASA TERMS Selections 
Exhibit B-5 

                                 Occurrence                   Co-occurring 
Print Terms                      Weights               Sum    Root Boost     Total 

rotary wings                     2*20,20,20,20,20,20   140    15             155 
torsional stress                 2*20,20,20,20         100                   100 
wing roots                       20,20                  40    47              87 
rigid rotor                      20                     20    47              67 
flaps                            2*20,20                60                    60 
equations of motion              20,20                  40    16              56 

Non-print Terms 

nonlinear equations              20                     20    22              42 
partial differential equations   20                     20    22              42 
simulation                       20,20                  40                    40 
equations                        9                       9    24              33 
aerodynamic loads                20                            7 
aerodynamic configuration        20                            7 
center of gravity                20 
reversed flow                    20 
longitudinal control             20 
flight                           2*9                    18 
stability                        2*9                    18 
aeroelasticity                   2*9                    18 
cross sections                   9 
stalling                         9 
deflections                      9 
compressibility                  4 


Manually Selected Terms 
Exhibit B-6 

Print Terms                      Comments 

Aerodynamic stability            Of questionable utility 
Equations of motion              Selected by both 
Helicopters                      Too general but on target 
Rotor blades                     Bad choice - refers to turbines not helicopters 

Search Terms 

Aerodynamic loads 
Aircraft configurations 
Nonlinearity 
Partial differential equations 

Computer terms not manually selected 

Rotary wings                     Excellent descriptor 
Wing roots                       Too highly rated; okay as a search term, 
                                 not as a print term 
Rigid rotor                      Highly descriptive, perhaps too specific 
                                 for a print term, however. 
Torsional stress                 Too descriptive not to include. 


N77-29097*# National Aeronautics and Space Administration. 
Langley Research Center, Langley Station, Va. 
LOAD DISTRIBUTION ON A CLOSED-COUPLED WING 
CANARD AT TRANSONIC SPEEDS 
Blair B. Gloss and Karen E. Washburn Aug. 1977 11 p refs 
(NASA-TM-74053) Avail: NTIS HC A02/MF A01 CSCL 01A 
A wind tunnel test where load distributions were obtained 
at transonic speeds on both the canard and wing surfaces of a 
closely coupled wing canard configuration is reported. Detailed 
component and configuration arrangement studies to provide 
insight into the various aerodynamic interference effects for the 
leading edge vortex flow conditions encountered are included. 
Data indicate that increasing the Mach number from 0.70 to 
0.95 caused the wing leading edge vortex to burst over the 
wing when the wing was in the presence of the high canard. 

Author 


Third Surrogate as Published (STAR) 
Exhibit C-1 

77N29097 ISSUE 20 PAGE 2622 CATEGORY 2 NASA-TM-74053 
77/08/00 11 PAGES UNCLASSIFIED DOCUMENT 

LOAD DISTRIBUTION ON A CLOSED-COUPLED WING CANARD AT TRANSONIC 
SPEEDS 

A/GLOSS, B. B.; B/WASHBURN, K. E. 

NATIONAL AERONAUTICS AND SPACE ADMINISTRATION. LANGLEY RESEARCH 
CENTER, LANGLEY STATION, VA. AVAIL. NTIS HC A02/MF A01 

/*CANARD CONFIGURATIONS/*LOAD DISTRIBUTION (FORCES)/*TRANSONIC 
SPEED/*WINGS/ AIRCRAFT STRUCTURES/ FLOW VISUALIZATION/ MACH NUMBER/ 
VORTICES/ WIND TUNNEL TESTS 

ABA AUTHOR 

ABS A WIND TUNNEL TEST WHERE LOAD DISTRIBUTIONS WERE OBTAINED AT 
TRANSONIC SPEEDS ON BOTH THE CANARD AND WING SURFACES OF A CLOSELY 
COUPLED WING CANARD CONFIGURATION IS REPORTED. DETAILED COMPONENT AND 
CONFIGURATION ARRANGEMENT STUDIES TO PROVIDE INSIGHT INTO THE VARIOUS 
AERODYNAMIC INTERFERENCE EFFECTS FOR THE LEADING EDGE VORTEX FLOW 
CONDITIONS ENCOUNTERED ARE INCLUDED. DATA INDICATE THAT INCREASING THE 
MACH NUMBER FROM 0.70 TO 0.95 CAUSED THE WING LEADING EDGE VORTEX TO 
BURST OVER THE WING WHEN THE WING WAS IN THE PRESENCE OF THE HIGH 
CANARD. 


RECON Display of the Third Surrogate 
Exhibit C-2 


Title: Load distribution on a Closed-coupled Wing Canard at 
Transonic Speeds 

Abstract: A wind tunnel test where load distributions were obtained 
at transonic speeds on both the canard and wing surfaces of a 
closely coupled wing canard configuration is reported. Detailed component 
and configuration arrangement studies to provide insight into the various 
aerodynamic interference effects for the leading edge vortex flow 
conditions encountered are included. Data indicate that increasing the 
Mach number from 0.70 to 0.95 caused the wing leading edge vortex to burst 
over the wing when the wing was in the presence of the high canard. 


Phrase Isolation 
Exhibit C-3 

Text Phrase Reduction 
Exhibit C-4 


Load:            Load distribution (forces) - closest match, weight 20; Load Factors/ 
                 Loads (forces), and 31 others which include the root "load." 

Closed-coupled:  not found 

Closed:          All occurrences of "CLOSE" in NASA Thesaurus Access Vocabulary: 
                     Close packed lattices 
                     Closed Basins 
                     Closed Circuit Television 
                     Closed Cycles 
                     Closed Ecological Systems 
                     Closed Faults 
                     Closed Loop Systems 
                     Closing 
                     Closure Law 
                     Closures 
                 No satisfactory usage found 

Coupled:         Charge Coupled Devices 
                 Coupled Modes 
                 Couplers 
                 Antenna Couplers 
                 Couples 
                 Coupling 
                 Coupling Circuits 
                 Coupling Coefficients 
                 Cross Coupling 
                 Gyroscopic Coupling 
                 Microwave Coupling 
                 Optical Coupling 
                 Spin-spin Coupling 
                 Thermodynamic Coupling 
                 Couplings 
                 No satisfactory usage 

Wing:            Fan in Wing Aircraft 
                 Fixed Wing Aircraft 
                 (through 64 such phrases, none involving CANARD) 
                 Wings, best match, weight 9 

Canard:          Canard Configurations, weight 50 

Transonic:       Transonic Aircraft/Supersonic Aircraft 
                 Transonic Aircraft Technology Program/TACT Program 
                 Transonic Flight 
                 Transonic Flow 
                 Transonic Flutter 
                 Transonic Inlets/Supersonic Inlets 
                 Transonic Nozzles 
                 Transonic Speed - term matches surrogate, weight 20 
                 Transonic Turbines/Supersonic Turbines 
                 Transonic Wind Tunnels 
                 Transonics/Transonic Flow 

Wind:            (44 entries including) Wind Tunnel Tests, weight 50 

Load:            (33 entries including) Load Distribution (Forces), weight 50 

Transonic, canard, wing - see above 



Text Phrase Reduction 
Exhibit C-4 
(continued) 


Surfaces:        (68 entries including) 
                 Surface Layers 
                 Surface Properties 
                 Surfaces - best match, weight 9 
                 Control Surfaces 
                 Tail Surfaces 
                 Lifting Surfaces/Surfaces, Lift Devices & Lifting Bodies 

Closely Coupled Wing Canard Configuration: see above 

Component:       Components (and 7 specific uses), weight 9 

Configuration:   (16 entries including) 
                 Hammerhead Configuration 
                 Configuration Management 
                 Configurations 
                 Aerodynamic Configurations 
                 Aircraft Configurations 
                 Body-Wing Configurations 
                 Body-Wing and Tail Configurations 
                 Canard Configurations - strengthened since previously selected, weight 50 
                 Launch Vehicle Configurations 

Aerodynamic:     (26 entries including) Aerodynamic Interference, weight 20 

Leading:         (15 entries for LEAD and) 
                 Leadership 
                 Leading Edge Slats 
                 Leading Edge Sweep 
                 Leading Edges - best match, weight 9 
                 Sharp Leading Edges 

Vortex:          (17 entries including) Vortex Flow/Vortices, weight 9 

Mach:            Mach Cones 
                 Mach Inertia Principle 
                 Mach Number - closest match, weight 8 without numbers 
                 Critical Mach Number 
                 Mach Zehnder Interferometers 

From 0.70, with "Mach" induced number evaluation: Subsonic speeds, weight 20 

To 0.95, with "Mach": Transonic speeds, weight 20 

Wing, Leading Edge, Vortex - see above 

Burst:           Bursts - best fit, weight 9 
                 Meteor Bursts 
                 Radio Bursts 
                 Solar Radio Bursts 
                 Type 2, 3, 4 or 5 Bursts 

Wing, Wing - Canard - see above 


NASA TERMS Selections 
Exhibit C-5 


                           Occurrence                 Co-occurring 
Terms                      Weights              Sum   Boosts         Total 

Canard Configurations      2*50,50,50,50,50     300                  300 
Transonic Speeds           2*20,20,20            80   10              90 
Wings                      2*9,9,9,9,9,9         63                   63 
Load Distribution          2*20,20               60                   60 
Wind Tunnel Tests          50                    50                   50 
Subsonic Speeds            20                         27              47 
Aerodynamic Interference   20                    20                   20 
Leading Edges              9,9                   18                   18 
Vortices                   9,9                   18                   18 
Burst                      9                      9                    9 
Surfaces                   9                      9                    9 
Components                 9                      9                    9 


Manually Selected Terms 
Exhibit C-6 

Print Terms                  Comments 

Canard Configurations        Common to both 
Load distribution (forces)   Common to both 
Transonic speeds             Common to both 
Wings                        Common to both 

Non-Print Terms 

Aircraft structures          Why? 
Flow visualization           How? 
Mach Number                  Plausible 
Vortices                     Reasonable, common to both 
Wind Tunnel Tests            Should be a print term considering 
                             usefulness for retrieval 


A78-11362 Solar electric-energy market penetration. R. K. 
Sarin and K. Nair (Woodward-Clyde Consultants, San Francisco, 
Calif.). In: International Solar Energy Society, Annual Meeting, 
Orlando, Fla., June 6-10, 1977, Proceedings. Sections 26-38. 
(A78-11212 01-44) Cape Canaveral, Fla., International Solar Energy 
Society, 1977, p. 28-13 to 28-17. 

A Bayesian approach was employed to forecast the solar electric 
market penetration by the years 1990 and 2000. The study 
identified a multitude of factors, including relative cost of competitive 
energy systems, government incentives, future environmental 
regulations, and new technologies, that would affect the solar market 
share. The judgments of several experts from utility companies, 
government agencies, and research laboratories were utilized in a 
systematic manner to quantify the probability distributions of future 
solar market share as a function of the various factors. The likelihood 
of the occurrence of these factors was also assessed, and the solar 
market share was forecasted for the most-likely future scenarios. 


78A11362 ISSUE 1 PAGE 79 CATEGORY 44 77/00/00 5 PAGES 
UNCLASSIFIED DOCUMENT 

SOLAR ELECTRIC-ENERGY MARKET PENETRATION 

A/SARIN, R. K.; B/NAIR, K. B/(WOODWARD-CLYDE CONSULTANTS, SAN 
FRANCISCO, CALIF.) 

IN: INTERNATIONAL SOLAR ENERGY SOCIETY, ANNUAL MEETING, ORLANDO, 
FLA., JUNE 6-10, 1977, PROCEEDINGS. SECTIONS 26-38. (A78-11212 01-44) 
CAPE CANAVERAL, FLA., INTERNATIONAL SOLAR ENERGY SOCIETY, 1977, P. 
28-13 TO 28-17. 

/*ELECTRIC POWER/*MARKET RESEARCH/*SOLAR ENERGY/*TECHNOLOGICAL 
FORECASTING/ BAYES THEOREM/ CLEAN ENERGY/ COST INCENTIVES/ ENERGY 
TECHNOLOGY/ PROBABILITY DISTRIBUTION FUNCTIONS 

ABA (AUTHOR) 

ABS A BAYESIAN APPROACH WAS EMPLOYED TO FORECAST THE SOLAR ELECTRIC 
MARKET PENETRATION BY THE YEARS 1990 AND 2000. THE STUDY IDENTIFIED A 
MULTITUDE OF FACTORS, INCLUDING RELATIVE COST OF COMPETITIVE ENERGY 
SYSTEMS, GOVERNMENT INCENTIVES, FUTURE ENVIRONMENTAL REGULATIONS, AND 
NEW TECHNOLOGIES, THAT WOULD AFFECT THE SOLAR MARKET SHARE. THE 
JUDGMENTS OF SEVERAL EXPERTS FROM UTILITY COMPANIES, GOVERNMENT 
AGENCIES, AND RESEARCH LABORATORIES WERE UTILIZED IN A SYSTEMATIC 
MANNER TO QUANTIFY THE PROBABILITY DISTRIBUTIONS OF FUTURE SOLAR MARKET 
SHARE AS A FUNCTION OF THE VARIOUS FACTORS. THE LIKELIHOOD OF THE 
OCCURRENCE OF THESE FACTORS WAS ALSO ASSESSED, AND THE SOLAR MARKET 
SHARE WAS FORECASTED FOR THE MOST-LIKELY FUTURE SCENARIOS. 


Abstract: A Bayesian approach was employed to forecast the solar 
electric market penetration by the years 1990 and 2000. The study 
identified a multitude of factors, including relative cost of competitive 
energy systems, government incentives, future environmental 
regulations, and new technologies, that would affect the solar market 
share. The judgments of several experts from utility companies, 
government agencies, and research laboratories were utilized in a 
systematic manner to quantify the probability distributions of future 
solar market share as a function of the various factors. The likelihood 
of the occurrence of these factors was also assessed, and the solar 
market share was forecasted for the most-likely future scenarios. 


Text Phrase Reduction 
Exhibit D-4 


Solar electric-energy market penetration: 

    Solar energy - since "solar" is not a valid term and the only "solar 
    electric" is "solar electric propulsion," which fails the subject 
    area test. Due to the large number of solar phrases, the phrase 
    solar electric energy will be reported as a candidate phrase for 
    manual review, weight 20. 

    Electric power - "electric" alone is not an entry phrase; therefore, 
    use "electric power" via "electrical energy" as best match, weight 20 

    Market - Marketing, weight 9 

    Penetration - Penetration, weight 9, flagged for manual review since use 
    is not in document having substance in normal areas of use, specifically 
    "geology" or "metallic materials" 

Bayesian: Bayes theorem via "Bayesian Statistics," the only term including 
    "Bayesian," weight 20 

Forecast: Technology Forecasting, weight 20 

Solar electric market penetration: (see above) - posted for manual revision as a 
    recurring phrase not in the thesaurus. 

Cost: Costs, weight 20 

Energy: Electric Power, in preference to "Energy," weight 20 

Government incentives: Government, weight 9; Cost incentives, weight 20, rather 
    than "incentives," weight 9 

Environmental regulations: Environmental control, weight 20, via "regulations" 
    being "used" to "control" 

Technologies: Technology forecasting, weight 20, in preference to Technologies, weight 9 

Solar market share: Solar energy, weight 20, in preference to "solar," weight 9; 
    Marketing, weight 9 

Utility companies: Utilities, weight 20 

Government: Governments, weight 20 

Research laboratories: Research Facilities, weight 20; Laboratories, weight 9 

Probability distributions: Probability Distribution Functions, weight 20 

Solar market share: see above 

Likelihood: Maximum Likelihood Estimates, weight 20, with posting to review as 
    over specific but only valid choice 

Solar market share: see above. 

Forecasted: see above. 

Most-likely: posted to new-word list. 


NASA TERMS Selections 
Exhibit D-5 

                              Occurrence            Co-occurring 
Print Terms                   Weights               Boosts 

Solar Energy                  2*20,20,20,20,20 
Electric Power                2*20,20,20 
Technological Forecasting     20,20,20 
Marketing                     2*9,9,9,9,9 

Non-Print Terms 

Cost Incentives               20                    10 
Penetration                   2*9,9 
Costs                         20                     7 
Environmental Control         20 
Utilities                     20 
Probability Distributions     20 
Maximum Likelihood Estimate   20 
Research Facilities           20 
Bayes Theorem                 20 
Government                    9,9 
Laboratories                  9 


Manually Selected Terms 
Exhibit D-6 


Print Terms:                         Comments: 

Electric Power                       Common selection. 
Market Research                      Good term, not in reach of present algorithm; 
                                     computer chose the general term "Marketing," 
                                     which is inferior. 
Solar Energy                         Common selection. 
Technological Forecasting            Common term. 

Non-print Terms: 

Bayes Theorem                        Also computer located 
Clean energy                         Unclear why selected 
Cost incentives                      Also computer located 
Energy technology                    General phrase, computer term and manual 
                                     term are more specific. 
Probability Distribution Functions   Also computer located 


4. Findings and Recommendations 

During the preparation of this paper, a few experimental and operational systems that 
deal with either computer aided indexing or natural language indexing have been 
identified as being of interest. Amongst these are DDC, SSIE, IBM/BROWSER, Wright 
Patterson AFB/CIRCOL, and SMART. Except for the CIRCOL study by WESTAT, none of 
these operational systems has been subject to comparative analysis. The MEDLARS 
system is currently based on human indexing and has undergone several analytical studies; 
all were designed to find out where the system was failing, not to compare alternatives. 
In all cases, the systems are functioning at a presumed satisfactory level 
of cost and performance, although no one has really made a significant attempt to 
evaluate adequacy of performance relative to less (or more) costly alternatives. 


Based primarily on the ongoing operational success of DDC, SSIE, CIRCOL and BROWSER 
and the systematic studies performed by Lancaster, Cleverdon, and Salton (and others), 
we believe that the publication of computer generated indexes from abstracts and 
information retrieval functions could be accomplished at significantly lower cost with 
negligible overall changes to effectiveness. 